
    Temporal Relational Reasoning in Videos

    Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species. In this paper, we introduce an effective and interpretable network module, the Temporal Relation Network (TRN), designed to learn and reason about temporal dependencies between video frames at multiple time scales. We evaluate TRN-equipped networks on activity recognition tasks using three recent video datasets - Something-Something, Jester, and Charades - which fundamentally depend on temporal relational reasoning. Our results demonstrate that the proposed TRN gives convolutional neural networks a remarkable capacity to discover temporal relations in videos. Using only sparsely sampled video frames, TRN-equipped networks can accurately predict human-object interactions in the Something-Something dataset and identify various human gestures on the Jester dataset with very competitive performance. TRN-equipped networks also outperform two-stream networks and 3D convolution networks in recognizing daily activities in the Charades dataset. Further analyses show that the models learn intuitive and interpretable visual common sense knowledge in videos.
    Comment: camera-ready version for ECCV'18
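    To make the module concrete, below is a minimal PyTorch-style sketch of a multi-scale temporal relation head in the spirit of this abstract. It is an illustrative assumption, not the authors' code: the class name, layer sizes, and the single-tuple-per-scale sampling are our own simplifications (the paper aggregates several sampled frame tuples per scale).

    ```python
    import torch
    import torch.nn as nn

    class TemporalRelationHead(nn.Module):
        """Scores ordered n-frame tuples with a small MLP per scale
        and sums the per-scale logits (hypothetical sketch)."""
        def __init__(self, feat_dim, num_classes, scales=(2, 3, 4)):
            super().__init__()
            self.scales = scales
            self.mlps = nn.ModuleList(
                nn.Sequential(nn.Linear(n * feat_dim, 256),
                              nn.ReLU(),
                              nn.Linear(256, num_classes))
                for n in scales)

        def forward(self, frame_feats):  # (batch, num_frames, feat_dim)
            b, t, d = frame_feats.shape
            logits = 0
            for n, mlp in zip(self.scales, self.mlps):
                # Sample one temporally ordered n-frame tuple per scale.
                idx = torch.sort(torch.randperm(t)[:n]).values
                logits = logits + mlp(frame_feats[:, idx].reshape(b, n * d))
            return logits
    ```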

    Knowledge Distillation for Multi-task Learning

    Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all of them at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), which leads to an imbalance problem in multi-task learning. To address this imbalance problem, we propose a knowledge distillation based method in this work. We first learn a task-specific model for each task. We then learn the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models. Since each task-specific network encodes different features, we introduce small task-specific adaptors that project the multi-task features to the task-specific features. In this way, the adaptors align the task-specific feature and the multi-task feature, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method optimizes a multi-task learning model in a more balanced way and achieves better overall performance.
    Comment: We propose a knowledge distillation method for addressing the imbalance problem in multi-task learning
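    As a rough sketch of the training objective described above, the snippet below combines the per-task losses with a feature-distillation term that aligns adaptor-projected multi-task features to the frozen task-specific (teacher) features. All argument names and the MSE choice for the alignment term are assumptions for illustration, not the paper's API.

    ```python
    import torch.nn.functional as F

    def balanced_mtl_loss(shared_feat, teacher_feats, adaptors,
                          task_losses, distill_weight=1.0):
        """shared_feat: features from the multi-task backbone.
        teacher_feats: dict task -> features of the frozen single-task model.
        adaptors: dict task -> small module projecting shared features
        into that task's feature space. task_losses: dict task -> loss."""
        loss = sum(task_losses.values())
        for task, teacher in teacher_feats.items():
            aligned = adaptors[task](shared_feat)
            # Distillation term: match the single-task teacher's features.
            loss = loss + distill_weight * F.mse_loss(aligned, teacher.detach())
        return loss
    ```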

    Adding New Tasks to a Single Network with Weight Transformations using Binary Masks

    Visual recognition algorithms are required today to exhibit adaptive abilities. Given a deep model trained on a specific task, it would be highly desirable to be able to adapt incrementally to new tasks, preserving scalability as the number of new tasks increases, while at the same time avoiding catastrophic forgetting. Recent work has shown that masking the internal weights of a given original conv-net through learned binary variables is a promising strategy. We build upon this intuition and consider more elaborate affine transformations of the convolutional weights that include learned binary masks. We show that with our generalization it is possible to achieve significantly higher levels of adaptation to new tasks, enabling the approach to compete with fine-tuning strategies while requiring slightly more than 1 bit per network parameter per additional task. Experiments on two popular benchmarks showcase the power of our approach, which achieves the new state of the art on the Visual Decathlon Challenge.
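    A hedged sketch of the idea, assuming one plausible parametrization: the frozen original weights are combined with a learned binary mask through learned affine coefficients, with a straight-through estimator making the mask trainable. The coefficient scheme (k0, k1) and all names are our assumptions, not the paper's exact transformation.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedAffineConv(nn.Module):
        """Adapts a frozen conv layer to a new task via a learned binary
        mask plus affine scaling of the original weights (sketch)."""
        def __init__(self, base_conv):
            super().__init__()
            self.base = base_conv
            for p in self.base.parameters():
                p.requires_grad_(False)  # original weights stay frozen
            self.mask_logits = nn.Parameter(torch.zeros_like(base_conv.weight))
            self.k0 = nn.Parameter(torch.ones(()))   # scales original weights
            self.k1 = nn.Parameter(torch.zeros(()))  # scales masked weights

        def forward(self, x):
            soft = torch.sigmoid(self.mask_logits)
            # Straight-through estimator: binary forward, soft backward.
            mask = (self.mask_logits > 0).float() + soft - soft.detach()
            w = self.k0 * self.base.weight + self.k1 * mask * self.base.weight
            return F.conv2d(x, w, self.base.bias,
                            self.base.stride, self.base.padding)
    ```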

    Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification

    Popular approaches for few-shot classification consist of first learning a generic data representation based on a large annotated dataset, before adapting the representation to new classes given only a few labeled samples. In this work, we propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches. First, we obtain a multi-domain representation by training a set of semantically different feature extractors. Then, given a few-shot learning task, we use our multi-domain feature bank to automatically select the most relevant representations. We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training, which leads to state-of-the-art results on Meta-Dataset and improved accuracy on mini-ImageNet.
    Comment: ECCV'20
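    The selection step might look like the following simplified sketch: one soft weight per extractor is optimized on each task's support set with a nearest-centroid loss, and the resulting weights indicate which domains are relevant. The sigmoid weighting, loss, step count, and temperature are all illustrative assumptions (labels are assumed to be 0..C-1).

    ```python
    import torch
    import torch.nn.functional as F

    def select_extractors(support_feats, support_labels, steps=50, lr=0.1):
        """support_feats: list of (n_support, d_i) L2-normalized feature
        tensors, one per domain extractor. Returns one weight per extractor."""
        lambdas = torch.zeros(len(support_feats), requires_grad=True)
        opt = torch.optim.Adam([lambdas], lr=lr)
        classes = support_labels.unique()
        for _ in range(steps):
            w = torch.sigmoid(lambdas)
            feats = torch.cat([wi * f for wi, f in zip(w, support_feats)], dim=1)
            centroids = torch.stack([feats[support_labels == c].mean(0)
                                     for c in classes])
            # Cosine similarity to class centroids as logits.
            logits = F.normalize(feats, dim=1) @ F.normalize(centroids, dim=1).T
            loss = F.cross_entropy(logits * 10.0, support_labels)
            opt.zero_grad(); loss.backward(); opt.step()
        return torch.sigmoid(lambdas).detach()
    ```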

    Using AI to Enable Design for Diversity: A Perspective

    Inclusive design focuses on diversity. A contextualized, user-sensitive design framework for interaction systems needs to analyze and deal with complex diversity factors, which challenges traditional design processes, tools, and methods; new technological progress is therefore needed to provide more potential for innovation. The authors point out that the design process of smart products is evolving in response to uncertainty. In the future, diversity-oriented design will tend to allocate design resources and values in an algorithmic way rather than through a compromised, one-size-fits-all solution. This paper analyzes the limitations and potential of AI technology, represented by deep learning, in diversity-oriented design practice and design research, puts forward goals and directions for further research, and discusses the critical links of AI-enabled diversity design in an interdisciplinary research environment.

    BĂ©zierSketch: A Generative Model for Scalable Vector Sketches

    The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints. However, this leads to low-resolution image generation and failure to model long sketches. In this paper we present BézierSketch, a novel generative model for fully vector sketches that are automatically scalable and high-resolution. To this end, we first introduce a novel inverse graphics approach to stroke embedding that trains an encoder to embed each stroke as its best-fit Bézier curve. This enables us to treat sketches as short sequences of parameterized strokes and thus train a recurrent sketch generator with greater capacity for longer sketches, while producing scalable high-resolution results. We report qualitative and quantitative results on the Quick, Draw! benchmark.
    Comment: Accepted as poster at ECCV 2020
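    As a concrete reference point for the "best-fit Bézier curve" embedding, here is a minimal direct least-squares fit of a single Bézier curve to a stroke (the paper instead trains an encoder to produce this embedding; the chord-length parameterization and cubic degree are our assumptions).

    ```python
    import numpy as np
    from math import comb

    def fit_bezier(points, degree=3):
        """Least-squares fit of one Bezier curve to a stroke.
        points: (n, 2) array of stroke coordinates.
        Returns the (degree + 1, 2) control points."""
        # Chord-length parameterization of the samples over [0, 1].
        d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(points, axis=0), axis=1))]
        t = d / d[-1]
        # Bernstein design matrix B[i, k] = C(deg, k) t_i^k (1 - t_i)^(deg - k).
        B = np.stack([comb(degree, k) * t**k * (1 - t)**(degree - k)
                      for k in range(degree + 1)], axis=1)
        ctrl, *_ = np.linalg.lstsq(B, points, rcond=None)
        return ctrl
    ```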

    “Are Machines Better Than Humans in Image Tagging?” - A User Study Adds to the Puzzle

    “Do machines perform better than humans in visual recognition tasks?” Not so long ago, this question would have been considered somewhat provoking, and the answer would have been a clear “No”. In this paper, we present a comparison of human and machine performance with respect to annotation for multimedia retrieval tasks. Going beyond recent crowdsourcing studies in this respect, we also report the results of two extensive user studies. In total, 23 participants were asked to annotate more than 1000 images of a benchmark dataset, the most comprehensive study in the field so far. Krippendorff’s α is used to measure inter-coder agreement among several coders, and the results are compared with the best machine results. The study is preceded by a summary of prior work comparing human and machine performance in different visual and auditory recognition tasks. We discuss the results and derive a methodology for comparing machine performance in multimedia annotation tasks against human-level performance, which allows us to formally answer the question of whether a recognition problem can be considered solved. Finally, we answer the initial question.
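    For readers unfamiliar with the agreement measure, the sketch below computes nominal-scale Krippendorff's α (α = 1 − D_o/D_e, observed over expected disagreement) for a complete units × coders matrix; the study's actual data handling (e.g. missing annotations) may differ.

    ```python
    import numpy as np

    def krippendorff_alpha_nominal(ratings):
        """ratings: (units, coders) array of integer labels, no missing
        values. Returns nominal-scale Krippendorff's alpha."""
        ratings = np.asarray(ratings)
        cats = np.unique(ratings)
        idx = {c: i for i, c in enumerate(cats)}
        o = np.zeros((len(cats), len(cats)))  # coincidence matrix
        for unit in ratings:
            m = len(unit)
            for i in range(m):
                for j in range(m):
                    if i != j:
                        o[idx[unit[i]], idx[unit[j]]] += 1.0 / (m - 1)
        n_c = o.sum(axis=1)                              # per-category totals
        n = n_c.sum()
        d_obs = (o.sum() - np.trace(o)) / n              # observed disagreement
        d_exp = (n**2 - (n_c**2).sum()) / (n * (n - 1))  # expected disagreement
        return 1.0 - d_obs / d_exp
    ```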

    Observation of Kuznetsov-Ma soliton dynamics in optical fibre

    The nonlinear Schrödinger equation (NLSE) is a central model of nonlinear science, applying to hydrodynamics, plasma physics, molecular biology and optics. The NLSE admits only a few elementary analytic solutions, but one in particular, describing a localized soliton on a finite background, is of intense current interest in the context of understanding the physics of extreme waves. However, although the first solution of this type, the Kuznetsov-Ma (KM) soliton, was derived in 1977, there have in fact been no quantitative experiments confirming its validity. We report here novel experiments in optical fibre that confirm the KM soliton theory, completing an important series of experiments that have now observed a complete family of soliton-on-background solutions to the NLSE. Our results also show that KM dynamics appear more universally than under the specific conditions originally considered, and can be interpreted as an analytic description of Fermi-Pasta-Ulam recurrence in NLSE propagation.
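    For reference, in the normalized form standard in the breather literature (not reproduced from this paper's text), the focusing NLSE and the one-parameter family of soliton-on-background solutions containing the KM soliton read:

    ```latex
    % Focusing NLSE in dimensionless form
    i\,\frac{\partial \psi}{\partial \xi}
      + \frac{1}{2}\,\frac{\partial^2 \psi}{\partial \tau^2}
      + |\psi|^2 \psi = 0,
    % One-parameter soliton-on-background family
    \psi(\xi,\tau) = \left[\, 1 +
      \frac{2(1-2a)\cosh(b\xi) + i\,b\,\sinh(b\xi)}
           {\sqrt{2a}\,\cos(\omega\tau) - \cosh(b\xi)} \,\right] e^{i\xi},
    \qquad b = \sqrt{8a(1-2a)}, \quad \omega = 2\sqrt{1-2a}.
    % a < 1/2: Akhmediev breather (periodic in tau);
    % a > 1/2: Kuznetsov-Ma soliton (periodic in xi, localized in tau);
    % a -> 1/2: Peregrine soliton limit.
    ```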